We congratulate Morris and Tang for an interesting addition to empirical
Bayes methods, and for tackling a difficult and nagging problem in variance
estimation. The ADM adjustment appears to bring interesting properties, not
just in variance estimation but also in estimation of the means. In this
discussion we want to focus on the latter topic, and see how the ADM-derived
estimators of a normal mean perform in a decision-theoretic way. To
facilitate this we will stay with the simple model

(1)  $y_i \mid \theta_i \sim N(\theta_i, V), \qquad \theta_i \sim N(0, A).$

1. THE JAMES-STEIN ESTIMATOR AS 
GENERALIZED BAYES (NOT!) 

We first address the comment of Morris and Tang in Section 2.5, that the
prior $A \sim \mathrm{Unif}(0, \infty)$ is strongly suggested because the
James-Stein estimator is the posterior mean if we take
$A \sim \mathrm{Unif}(-V, \infty)$. Professor Morris has noted this before,
and in the interest of understanding, we want to show this calculation and
comment on its relevance.

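Writing $y = (y_1, \ldots, y_k)$ and $\theta = (\theta_1, \ldots, \theta_k)$,
the posterior expected loss from model (1), with the
$A \sim \mathrm{Unif}(-V, \infty)$ prior, is proportional to

(2)  $\int_{-V}^{\infty} \int |\theta - \delta(y)|^2 \, \frac{e^{-|y - \theta|^2/(2V)}}{(2\pi V)^{k/2}} \, \frac{e^{-|\theta|^2/(2A)}}{(2\pi A)^{k/2}} \, d\theta \, dA,$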
and factoring the exponent in (2) and writing $B = V/(V + A)$ shows that

    $\theta \mid y, A \sim N\big((1 - B)y, \; V(1 - B)\big),$

    $A \mid y \propto \left(\frac{1}{V + A}\right)^{k/2} e^{-(1/(2(V + A)))|y|^2}.$

The Bayes rule is the posterior mean, which we can calculate as

    $E(\theta \mid y) = E[E(\theta \mid y, A) \mid y]$

    $= E[(1 - B)y \mid y] = [1 - E(B \mid y)]\,y.$

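We now, very carefully, calculate $E(B \mid y)$, yielding

    $E(B \mid y) \propto \int_{-V}^{\infty} \frac{V}{V + A} \left(\frac{1}{V + A}\right)^{k/2} e^{-(1/(2(V + A)))|y|^2} \, dA$

    $= V \int_{1/V}^{\infty} t^{k/2 - 1} e^{-t|y|^2/2} \, dt + V \int_{0}^{1/V} t^{k/2 - 1} e^{-t|y|^2/2} \, dt,$

where we make the transformation $t = 1/(V + A)$, with the first integral
coming from $A \in (-V, 0)$. Noting that the integrand is the kernel of a
chi-squared density, we finally have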



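(3)  $E(B \mid y) \propto \frac{V \, \Gamma(k/2) \, 2^{k/2}}{(|y|^2)^{k/2}} \left[ P(\chi^2_k > |y|^2/V) + P(\chi^2_k < |y|^2/V) \right],$

where $\chi^2_k$ is a chi-squared random variable with $k$ degrees of
freedom. Since the chi-squared probabilities sum to 1, normalizing this
expectation (dividing by $\Gamma(k/2 - 1) \, 2^{k/2 - 1} / (|y|^2)^{k/2 - 1}$)
results in $E(B \mid y) = V(k - 2)/|y|^2$, yielding the James-Stein
estimator. There are a number of things to note: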

1. If this were a valid calculation, it would contradict such important
papers as Brown (1971) and Strawderman and Cohen (1971), which provided
complete characterizations of admissible generalized Bayes estimators.

2. In fact, Strawderman and Cohen [(1971), Section 4.5] explicitly tell us
that the James-Stein estimator cannot be generalized Bayes.

3. In fact, the calculation leading to $E(B \mid y) = V(k - 2)/|y|^2$ is
invalid. To see this note that, starting from (1), with
$A \sim \mathrm{Unif}(-V, \infty)$, the prior on $\theta$ is

    $\int_{-V}^{\infty} \frac{e^{-|\theta|^2/(2A)}}{(2\pi A)^{k/2}} \, dA,$

and, even if we take $k$ to be even to avoid complex integration, it is
straightforward to verify that the integral over $(-V, 0)$ is infinite.
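
Both halves of this argument can be checked numerically. The following is a minimal sketch, not part of the original calculation: `formal_EB` evaluates the formal ratio of integrals in the variable $t = 1/(V + A)$ and compares it with $V(k - 2)/|y|^2$, while `prior_piece_k4` evaluates the prior integral over $(-V, -\varepsilon)$ for the even case $k = 4$ and shows it growing without bound as $\varepsilon \to 0$. The function names, the midpoint rule, the truncation point `t_max` and the test values are all assumptions chosen for illustration.

```python
import math

def midpoint(f, a, b, n=100_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def formal_EB(s, k, V=1.0, t_max=40.0):
    """Formal E(B|y) under A ~ Unif(-V, inf), in the variable t = 1/(V + A).

    s stands for |y|^2. The numerator integrand is V t^(k/2-1) e^(-t s/2) and
    the normalizing integrand is t^(k/2-2) e^(-t s/2). Both integrals run over
    t in (0, inf); they are truncated at t_max, which is adequate only when
    the integrands are negligible beyond it (true for the values used below).
    """
    num = midpoint(lambda t: V * t ** (k / 2 - 1) * math.exp(-t * s / 2), 0.0, t_max)
    den = midpoint(lambda t: t ** (k / 2 - 2) * math.exp(-t * s / 2), 0.0, t_max)
    return num / den

def prior_piece_k4(theta_sq, eps, V=1.0):
    """Integral of e^(-|theta|^2/(2A)) / (2 pi A)^2 over A in (-V, -eps), k = 4.

    Substituting u = -1/A turns it into the integral of
    e^(theta_sq * u / 2) / (2 pi)^2 over u in (1/V, 1/eps).
    """
    return midpoint(lambda u: math.exp(theta_sq * u / 2) / (2 * math.pi) ** 2,
                    1 / V, 1 / eps)

if __name__ == "__main__":
    k, V, s = 10, 1.0, 12.0
    # The formal ratio agrees with the James-Stein factor V (k - 2) / |y|^2.
    print(formal_EB(s, k, V), V * (k - 2) / s)
    # The piece of the prior over negative A explodes as eps shrinks.
    for eps in (0.1, 0.05, 0.01):
        print(eps, prior_piece_k4(1.0, eps, V))
```

The first comparison reproduces the formal result, and the second shows why that result cannot be trusted: the contribution from negative $A$ to the prior is not finite.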

What does this tell us about the James-Stein estimator? The "bad" part of
the integral, which leads to the piece in (3) corresponding to
$P(\chi^2_k \ge |y|^2/V)$, is to be avoided. We can informally interpret this
as pointing to the region where $|y|^2/V$ is small, resulting in shrinkage
factors that could be greater than 1 (in absolute value), and result in the
James-Stein estimator both changing the sign and expanding $y$. When we lop
off this part, we are led to estimators such as the positive-part
James-Stein estimator, or admissible estimators, like those based on (32) in
Morris and Tang.
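
The sign change and expansion are easy to see with a few numbers. The sketch below is illustrative only; the function names and the values $k = 10$, $V = 1$ are assumptions. It computes the multiplier $1 - B$ applied to $y$ by the James-Stein estimator, with $B = V(k - 2)/|y|^2$, and by its positive-part version, which truncates the multiplier at zero.

```python
def js_multiplier(s, k, V=1.0):
    """Multiplier 1 - B applied to y by James-Stein, where s = |y|^2
    and B = V (k - 2) / s is the shrinkage factor."""
    return 1.0 - V * (k - 2) / s

def positive_part_multiplier(s, k, V=1.0):
    """Positive-part James-Stein: the multiplier is truncated at zero,
    which keeps the shrinkage factor B between 0 and 1."""
    return max(0.0, js_multiplier(s, k, V))

if __name__ == "__main__":
    k = 10
    for s in (16.0, 4.0, 2.0):
        print(s, js_multiplier(s, k), positive_part_multiplier(s, k))
    # s = 16: B = 0.5, ordinary shrinkage toward zero.
    # s = 4:  B = 2, multiplier -1, so y is reflected through the origin.
    # s = 2:  B = 4, multiplier -3, so y is reflected and expanded.
```

For small $|y|^2/V$ the untruncated multiplier is negative and can exceed 1 in absolute value, which is exactly the behavior the positive-part estimator removes.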

2. MINIMAXITY OF ADM 

The lesson from the previous section is to avoid estimators that do not
control the shrinker to be between 0 and 1. So we turn to ADM and ask if it
can do this. We find, interestingly, that the ADM approach will, almost
automatically, give us a minimax estimator and, moreover, it controls the
shrinker.

Typically, minimax estimators have been constructed using empirical Bayes
arguments and a bit of customizing, or using formal Bayes derivations with
priors like $A \sim \mathrm{Unif}(0, \infty)$. The derivation of Morris and
Tang in Section 2.7 is a straightforward differentiation, and we can apply
the following theorem. [This is Theorem 5.5, Chapter 5, Lehmann and Casella
(1998), and can be traced back to Baranchik (1970).]

Theorem 1. Under model (1), the estimator

    $\delta(y) = \left(1 - \frac{V g(|y|)}{|y|^2}\right) y$

is minimax under the loss $|\theta - \delta(y)|^2$ if

1. the function $g(|y|)$ is nondecreasing,

2. $0 \le g(|y|) \le 2(k - 2)$.

In the notation of Morris and Tang, we are considering the case $r = 0$, and
writing $T = |y|^2/(2V)$, the ADM shrinkage factor is

(4)  $B = \frac{1}{T} \left[ \frac{2(m - c + 1)T}{T + m + 1 + \sqrt{(T - m - 1)^2 + 4cT}} \right].$



Morris and Tang note that $B$ is monotone decreasing in $T$, but for
minimaxity we need the function in square brackets, which corresponds to
$g(\cdot)$ of the theorem, to be nondecreasing. As $T \to \infty$, the
function converges to $m - c + 1$. If this is the maximum, and is less than
$2(k - 2)$, then the estimator will be minimax. In fact, it is
straightforward (but tedious) to show that the derivative of the function in
square brackets is always nonnegative, so the function is nondecreasing and
the estimator is minimax. For $c = 1$ the bound can be satisfied by taking
$m = (k - 2)/2$, for $k \ge 3$.
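
The monotonicity and the limit can be checked numerically rather than by differentiation. The sketch below is a sanity check under stated assumptions, not a proof: it takes the ADM shrinkage factor to be $B = 2(m - c + 1)/\big(T + m + 1 + \sqrt{(T - m - 1)^2 + 4cT}\big)$, which is our reading of (4), so that the bracketed function is $T \cdot B$. The function names, the grid and the values $k = 10$, $c = 1$ are illustrative choices.

```python
import math

def adm_B(T, m, c=1.0):
    """ADM shrinkage factor, as read from (4), with T = |y|^2 / (2 V)."""
    root = math.sqrt((T - m - 1) ** 2 + 4 * c * T)
    return 2 * (m - c + 1) / (T + m + 1 + root)

def bracket(T, m, c=1.0):
    """The function in square brackets in (4), that is, T * B."""
    return T * adm_B(T, m, c)

if __name__ == "__main__":
    k, c = 10, 1.0
    m = (k - 2) / 2                           # the Morris and Tang choice
    grid = [0.05 * i for i in range(1, 20001)]  # T from 0.05 to 1000
    values = [bracket(T, m, c) for T in grid]
    # Nondecreasing on the grid, up to floating-point noise.
    print(all(b >= a - 1e-12 for a, b in zip(values, values[1:])))
    # The shrinker stays strictly between 0 and 1 on the grid.
    print(all(0.0 < adm_B(T, m, c) < 1.0 for T in grid))
    # The bracketed function approaches m - c + 1 for large T.
    print(bracket(1e8, m, c), m - c + 1)
```

A grid check of this kind cannot replace the derivative argument, but it does catch transcription errors in the formula for $B$.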

Unfortunately, this is as far as we can go. The estimator based on (4),
which is reminiscent of a ridge regression estimator, cannot be admissible.
Again we can trace this back to Strawderman and Cohen (1971), and also
Berger and Srinivasan (1978). The problem is that (being a bit informal
here) admissible estimators must be analytic in the complex plane, which is
not the case with those based on (4).

Lastly, we wanted to see the risk performance of ADM. Morris and Tang set
$m = (k - 2)/2$, but there is actually a range of values of $m$ for which
the estimator is minimax. To clarify, denote $(k - 2)/2 = m^*$, the Morris
and Tang choice, and consider the $m$ in (4) to be a variable. Then, for
$c = 1$ the estimator is minimax for all $m < 2(k - 2)$. In Figure 1 we see,
in the left panel, the risk of five ADM estimators, along with the risk of
the James-Stein estimator for comparison. There we see that the choice of
$m$ completely orders the ADM risk, with $m^* = (k - 2)/2$ being the best
choice, resulting in an estimator with risk similar to that of James-Stein.
In the right panel we compare the ADM estimator, with $m^* = (k - 2)/2$, to
the James-Stein estimator, its positive-part version, and the admissible
estimator with $B$ given in (32). There we see that ADM compares favorably
with the James-Stein estimator, is uniformly dominated in risk by the
admissible estimator, but not by the positive-part estimator, whose risk
crosses that of ADM for large $|\theta|$.

Fig. 1. For dimension $k = 10$, the left panel shows the risk of the
James-Stein estimator (dashed line) and five ADM estimators (solid lines),
for $m = (k - 2)/2, (k - 2)/4, (k - 2)/6, (k - 2)/8, (k - 2)/10$. The risk
function increases uniformly as the denominator increases, so
$m^* = (k - 2)/2$ gives the smallest risk. The right panel shows the risk of
the ADM estimator with $m^* = (k - 2)/2$ (solid), the James-Stein estimator
(dashed), the positive-part estimator (dotted), and the admissible estimator
with $B$ of (32) (dash-dot).
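
A risk comparison of this kind can be approximated by simulation. The following Monte Carlo sketch is not the code behind Fig. 1; it is a rough illustration under several assumptions: the ADM shrinker is taken as our reading of (4) with $c = 1$ and $m = (k - 2)/2$, the function names, seed and replication count are arbitrary, and the admissible estimator based on (32) is left out because its form is not reproduced here. The same draws of $y$ are used for all three estimators to reduce noise in the comparison.

```python
import math
import random

def adm_B(T, m, c=1.0):
    """ADM shrinkage factor, as read from (4), with T = |y|^2 / (2 V)."""
    root = math.sqrt((T - m - 1) ** 2 + 4 * c * T)
    return 2 * (m - c + 1) / (T + m + 1 + root)

def risks(theta_norm, k=10, V=1.0, reps=20_000, seed=0):
    """Monte Carlo estimate of E|theta - delta(y)|^2 for three estimators."""
    rng = random.Random(seed)
    m = (k - 2) / 2
    # All three estimators are spherically symmetric, so the risk depends on
    # theta only through its norm; put the whole norm on one coordinate.
    theta = [theta_norm] + [0.0] * (k - 1)
    sd = math.sqrt(V)
    totals = {"JS": 0.0, "JS+": 0.0, "ADM": 0.0}
    for _ in range(reps):
        y = [rng.gauss(t, sd) for t in theta]
        s = sum(v * v for v in y)                  # |y|^2
        B_js = V * (k - 2) / s
        multipliers = {
            "JS": 1.0 - B_js,
            "JS+": max(0.0, 1.0 - B_js),
            "ADM": 1.0 - adm_B(s / (2 * V), m),
        }
        for name, a in multipliers.items():
            totals[name] += sum((t - a * v) ** 2 for t, v in zip(theta, y))
    return {name: total / reps for name, total in totals.items()}

if __name__ == "__main__":
    for norm in (0.0, 2.0, 4.0, 8.0):
        print(norm, risks(norm))
```

Each estimated risk can be compared with the constant risk $kV$ of the unshrunk estimator $y$, which is the minimax value in this problem.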



3. IS ADM AUTOMATIC? 

The automatic appearance of the ADM minimax estimator gives support to the
claim of Morris and Tang that "ADM maintains the spirit of MLE while making
small sample improvements." In fact, examination of the ADM shrinker $B$,
and its risk functions, shows that "automatic" ADM produces an estimator
that does not shrink as strongly as either the admissible estimator or the
positive-part and, hence, can have smaller risk for larger values of the
norm of $\theta$. It is not clear to us that such small sample properties as
minimaxity will continue to hold for other models, for example for the
estimation of a Poisson mean, where many similar minimaxity results hold.
However, the results of Morris and Tang are encouraging and certainly
deserve further investigation.